Variable Selection Via Thompson Sampling

Authors

Abstract

Thompson sampling is a heuristic algorithm for the multi-armed bandit problem which has a long tradition in machine learning. The algorithm has a Bayesian spirit in the sense that it selects arms based on posterior samples of the reward probabilities of each arm. By forging a connection between combinatorial binary bandits and spike-and-slab variable selection, we propose a stochastic optimization approach to subset selection called Thompson Variable Selection (TVS). TVS is a framework for interpretable machine learning which does not rely on the underlying model to be linear. It brings together Bayesian reinforcement learning and machine learning in order to extend the reach of Bayesian subset selection to nonparametric models and large datasets with very many predictors and/or observations. Depending on the choice of reward, TVS can be deployed in offline as well as online setups with streaming data batches. Tailoring multiplay bandits to variable selection, we provide regret bounds without necessarily assuming that the arm mean rewards are unrelated. We show strong empirical performance on both simulated and real data. Unlike deterministic optimization methods, the stochastic nature of TVS makes it less prone to local convergence and thereby more robust.
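To illustrate the posterior-sampling mechanism the abstract describes, here is a minimal Beta-Bernoulli Thompson sampling sketch for a standard multi-armed bandit. This is not the TVS algorithm itself; the arm probabilities, priors, and horizon are illustrative assumptions.

```python
import random

def thompson_sampling(true_probs, horizon, seed=0):
    """Beta-Bernoulli Thompson sampling: each round, draw a reward
    probability for every arm from its Beta posterior and play the
    arm whose draw is largest."""
    rng = random.Random(seed)
    k = len(true_probs)
    # Beta(1, 1) priors: alpha tracks successes + 1, beta failures + 1.
    alpha = [1.0] * k
    beta = [1.0] * k
    pulls = [0] * k
    for _ in range(horizon):
        # One posterior sample per arm; the argmax is the arm to play.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        # Observe a Bernoulli reward and update that arm's posterior.
        reward = 1 if rng.random() < true_probs[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

# Over a long horizon, play concentrates on the best arm (p = 0.8).
pulls = thompson_sampling([0.2, 0.5, 0.8], horizon=2000)
```

Because arms are chosen by random posterior draws rather than a deterministic rule, suboptimal arms keep a small but nonzero chance of being explored, which is the property the abstract credits for robustness against local convergence.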


Related articles

Variable Selection Via Gibbs Sampling



Portfolio Blending via Thompson Sampling

As a definitive investment guideline for institutions and individuals, Markowitz’s modern portfolio theory is ubiquitous in the financial industry. However, its noticeably poor out-of-sample performance due to the inaccurate estimation of parameters evokes unremitting efforts of investigating effective remedies. One common retrofit that blends portfolios from disparate investment perspectives has r...


Stochastic Regret Minimization via Thompson Sampling

The Thompson Sampling (TS) policy is a widely implemented algorithm for the stochastic multiarmed bandit (MAB) problem. Given a prior distribution over possible parameter settings of the underlying reward distributions of the arms, at each time instant, the policy plays an arm with probability equal to the probability that this arm has largest mean reward conditioned on the current posterior di...


Asynchronous Parallel Bayesian Optimisation via Thompson Sampling

We design and analyse variations of the classical Thompson sampling (TS) procedure for Bayesian optimisation (BO) in settings where function evaluations are expensive, but can be performed in parallel. Our theoretical analysis shows that a direct application of the sequential Thompson sampling algorithm in either synchronous or asynchronous parallel settings yields a surprisingly powerful resul...


Variable Selection by Perfect Sampling

Variable selection is very important in many fields, and for its resolution many procedures have been proposed and investigated. Among them are Bayesian methods that use Markov chain Monte-Carlo (MCMC) sampling algorithms. A problem with MCMC sampling, however, is that it cannot guarantee that the samples are exactly from the target distributions. This drawback is overcome by related methods kn...



Journal

Journal: Journal of the American Statistical Association

Year: 2021

ISSN: 0162-1459, 1537-274X, 2326-6228, 1522-5445

DOI: https://doi.org/10.1080/01621459.2021.1928514